Thermal Bridges on Building Rooftops

Thermal Bridges on Building Rooftops (TBBR) is a multi-channel remote sensing dataset. It was recorded during six separate UAV fly-overs of the city center of Karlsruhe, Germany, and comprises a total of 926 high-resolution images with 6927 manually-provided thermal bridge annotations. Each image provides five channels: three color, one thermographic, and one computationally derived height map channel. The data is pre-split into training and test data subsets suitable for object detection and instance segmentation tasks. All data is organized and structured to comply with FAIR principles, i.e. being findable, accessible, interoperable, and reusable. It is publicly available and can be downloaded from the Zenodo data repository. This work provides a comprehensive data descriptor for the TBBR dataset to facilitate broad community uptake.


Background & Summary
About 30% of global final energy consumption and 27% of total energy sector emissions stem from building operations. After a short drop during the COVID-19 pandemic, emissions and energy consumption are both now above their pre-COVID level of 2019, showing that no late reduction trend has started 1 .
A major field for reducing energy consumption for building operations is the improvement of building envelopes, which is critical for reductions in heating and cooling intensity 2 . A thermal bridge is a discontinuity of a building's envelope, whose thermal properties differ fundamentally from the thermal properties of the adjacent enveloping surface 3 . With increasing demands on the quality of building envelopes, the minimization of thermal bridges is becoming ever more important, since losses from thermal bridges can account for up to one third of a building's transmission heat loss 4,5 . Beyond increased energy consumption, thermal bridges can lead to a wide range of problems, from the risk of condensation and mold infestation 6 , to a reduced comfort that occurs due to cold inner surfaces of a building 7 . In summer, thermal bridges lead to increased heat absorption by buildings and thus can increase the need for air conditioning 3 .
For the detection of thermal bridges of building envelopes, thermography can be reliably used 8. In recent years, not only individual buildings, but also buildings in their urban context have gained importance for developing adequate retrofit strategies. The New Urban Agenda of the United Nations (UN) puts a spotlight on policies affecting urban structures at all appropriate levels, recognizing that building design is one of the "greatest drivers of cost and resource efficiencies" 9. When studying building stocks in cities, city districts, and villages, thermographic images can be collected with Unmanned Aerial Vehicles (UAVs/drones) 10,11. Compared to classical thermography with static cameras, thermography with drones is especially advantageous because it saves time and resources and scales to large areas 10. UAV-based thermographic systems are particularly beneficial when examining rooftops, since recordings with hand-held cameras are difficult. Previously, thermographic rooftop inspections had to be carried out on site at night, which is particularly labor-intensive and dangerous and cannot achieve the coverage feasible with drones 12.
Manually evaluating the large number of thermographic images collected in urban areas is time-consuming. The detection of thermal bridges can be automated, but this is not trivial. Currently, approaches for automated thermal bridge detection work mostly with temperature threshold values and pattern recognition [13][14][15][16]. It is, however, difficult to find threshold values that can be generally applied to all types of thermal bridges 17. Patterns and temperatures differ depending on the materials and building components where thermal bridges occur, on environmental conditions, and on recording settings. For example, windows appear cooler on thermographic images due to the high reflectivity of glass surfaces 18. Furthermore, simple threshold methods can lead to misinterpretations, e.g. caused by open windows. Deep learning methods can overcome the aforementioned problems and may provide better results, but require annotated image datasets.
In this data descriptor, we present the Thermal Bridges on Building Rooftops (TBBR) dataset. To the best of our knowledge, it is the first comprehensive aerial thermographic image dataset that also provides height map information and is fully annotated for district-scale segmentation of thermal bridges on building rooftops. It is organized and structured according to the FAIR principles 19, i.e. being findable, accessible, interoperable and reusable.
The remainder of the data descriptor is organized as follows: the Methods section describes the environmental conditions and methodological approach in recording the TBBR dataset. Data Records details the organization of the data, including file formats, how the data has been preprocessed and curated, as well as how to obtain it from a publicly available data repository. In the Technical Validation section we highlight data quality aspects of TBBR. Finally, the Usage Notes section sketches current and prospective use case scenarios for the data with an emphasis on (semi-)automated thermal bridge object detection and instance segmentation.

Methods
The raw images for our dataset were recorded with a Zenmuse XT2 visual (RGB) and a FLIR Tau 2 (thermal, https://flir.netx.net/file/asset/15598/original/) camera (see Table 1 for details) on a DJI M600 drone (https://www.dji.com/de/matrice600). They were recorded at flight heights of 60-80 m above ground with a flight speed of 1 m s⁻¹ and contain GPS information. The images cover six large blocks of around 20 buildings per block recorded in the city center of the German city Karlsruhe with a total fly-over area of roughly 48500 m² (see Fig. 1). Because of a high overlap rate of the images, the same buildings are on average recorded about 20 times from different angles in different images. All images were recorded during drone flights on Tuesday 19th March 2019 from 7am to 8am (UTC + 02:00). At this time, temperatures were between 3.78 °C and 4.97 °C, and humidity between 80% and 98%. There was no rain on the day of the flights, but there was [...] Table 2. We do not provide information on the recorded buildings' internal temperatures; for estimates we refer readers to the corresponding German DIN standards 20.
The full set of raw images captured contained a total of 5698 images before preselection 21 . Preselection involved the removal of all blurry images, e.g. due to rapid movement or turning of the drone, and all images containing no visible thermal bridges. After preselection a total of 926 images remained.
The RGB and thermal drone images were fused with a computed height map. All images were converted to a uniform format of 4000 × 3000 px, aligned, and cropped to 3370 × 2680 px to remove empty borders. The annotations only include thermal bridges that are visually identifiable with the human eye. Because of the aforementioned image overlap, each thermal bridge is annotated multiple times from different angles. The thermal images were annotated with the VGG Image Annotator from the Visual Geometry Group, version 2.0.10 22. The thermal bridge annotations are outlined with polygon shapes. The polygon lines were placed as close as possible to, but outside, the area of significant temperature increase. If a detected thermal bridge was partially covered by another building component located in the foreground, the thermal bridge was also marked across the covering, provided the covering was minor. Adjacent thermal bridges that affect different rooftop components were annotated separately. For example, a window with a poorly insulated window reveal located in the area of a poorly insulated roof is annotated individually. There is no overlap between annotated areas. While every image contains annotations, images may also contain thermal bridges that are not annotated because they could not be clearly identified, e.g. because they are too small for accurate identification or unclear due to the camera perspective.
Image preparation. The image registration and alignment procedure is shown in Fig. 2.
Table 1. Technical specifications of the cameras used in recording the TBBR raw data. As the thermal camera is less than one year since purchase, it is still factory calibrated (see https://www.flir.co.uk/support-center/surveillance/infrared-camera-calibration/).
www.nature.com/scientificdata
The distortion correction procedure used was that established in previous works 23,24. In short, a reference image was used to determine distortion coefficients, cv2.getOptimalNewCameraMatrix() to find a new camera matrix, and cv2.undistort() to correct distortion. All mentioned processing functions are part of the computer vision programming library OpenCV 25.
Image registration and alignment was then performed by transforming the RGB and height map images onto the thermal images, as the annotation of thermal bridges was performed on these. A homography matrix was calculated using a total of 316 coordinate pairs from 21 RGB and thermal images. This homography matrix was then used to transform all RGB images in the dataset. Since the height map was created from the RGB images, we also used this homography matrix to transform the height map images.
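As an illustration of this step, the following minimal NumPy sketch estimates a 3 × 3 homography from matching coordinate pairs using the direct linear transform (DLT) and applies it to points. It is not the authors' implementation: the source does not state which estimator was used, and the function names estimate_homography and apply_homography are hypothetical.

```python
import numpy as np


def estimate_homography(src, dst):
    """Estimate H (3x3) with dst ~ H @ src using the direct linear transform.

    Assumption: plain, unnormalized DLT; the source does not name the method.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    if src.shape != dst.shape or src.ndim != 2 or src.shape[1] != 2 or len(src) < 4:
        raise ValueError("need at least four matching (x, y) coordinate pairs")
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear constraints on the nine entries of H per coordinate pair.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]  # fix the arbitrary scale


def apply_homography(h, points):
    """Map (x, y) points through H using homogeneous coordinates."""
    pts = np.asarray(points, dtype=float)
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homogeneous @ np.asarray(h, dtype=float).T
    return mapped[:, :2] / mapped[:, 2:3]
```

With more than four coordinate pairs, as in the 316 pairs mentioned above, the same code yields a least-squares estimate in the algebraic sense. For noisy pixel coordinates, normalizing the points before the SVD is generally advisable.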
The final cropping and stacking was performed to create the 5-channel images of the TBBR dataset, output in the NumPy format 26. Images are cropped to 3370 × 2680 px to remove large black borders present in thermal images, and subsequently stacked into the channel order [B, G, R, Thermal, Height].
Computation of the height map. Due to the high overlap of images, we can extract similarities from feature points identified in each image and conduct photogrammetry. Photogrammetry allows estimation of the three-dimensional coordinates of points on an object in a generated 3D space from measurements made on images taken with a high overlap rate. Therefore, we can use this technique to create a 3D point cloud model of the recorded region.
We used the ContextCapture software to perform photogrammetry on the TBBR dataset. ContextCapture provides users with intermediate information necessary to obtain each image's estimated 3D coordinates and orientation 23,24. This information allowed us to estimate the distance between points in 3D and 2D spaces and to project points from the 3D to the 2D space to generate the height maps. Each pixel of the resulting 2D height map shows the z-axis value (vertical height) of the corresponding point in the 3D point cloud model, normalized to the 8-bit range from the lowest 3D model point (0) to the drone (255).
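The 8-bit normalization described above can be sketched as follows. This is a minimal illustration assuming a linear mapping with rounding and clipping, which the source does not spell out; the function name normalize_height and the sample heights are hypothetical.

```python
import numpy as np


def normalize_height(z, z_lowest, z_drone):
    """Linearly map heights so that z_lowest -> 0 and z_drone -> 255 (uint8).

    Assumption: linear scaling with rounding; values outside the range are clipped.
    """
    if z_drone <= z_lowest:
        raise ValueError("z_drone must be larger than z_lowest")
    scaled = (np.asarray(z, dtype=float) - z_lowest) / (z_drone - z_lowest)
    return np.round(np.clip(scaled, 0.0, 1.0) * 255.0).astype(np.uint8)


# Made-up heights in metres: lowest model point at 110 m, drone at 190 m.
heights = np.array([[110.0, 120.0], [170.0, 190.0]])
height_map = normalize_height(heights, z_lowest=110.0, z_drone=190.0)
```

Because the upper bound is the drone position rather than the highest roof, height values are only comparable between images if the drone altitude is comparable as well.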

Data Records
The Thermal Bridges on Building Rooftops (TBBR) data is publicly available on Zenodo 27 (see Fig. 3). Archives were compressed using Zstandard compression 28. They can be decompressed with utility programs, e.g. tar or unzstd. Corresponding annotations are provided in the COCO JSON format 29, which were automatically generated by the VGG Image Annotator.
One of TBBR's main design objectives was to facilitate (semi-)automated thermal bridge pattern detection algorithms 30 (see Usage Notes). Accordingly, the data is pre-split into train and test subsets with 723 (5614) and 203 (1313) images (annotations), respectively. There is one COCO JSON annotation file for each subset, i.e. one for training (Flug1_100Media to Flug1_104Media) and one for test (Flug1_105Media) data. The latter block is used as a hold-out test dataset to standardize the assessment of out-of-sample generalization performance.
The experimental metadata was structured with the SpatioTemporal Asset Catalog (STAC) (https://stacspec.org/en) specification family. This specification provides a standardized way of describing geospatial assets. It defines the related JSON object types Item, Catalog, and Collection, with Collection extending Catalog. Moreover, STAC objects can be extended with other specifications, which provides a mechanism for adding further metadata. Such an approach addresses the need for a common understanding of experimental metadata, ideally based on a widely accepted standard 31.
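To illustrate how a COCO JSON annotation file of this kind can be inspected, the sketch below counts annotations per image using only the Python standard library. The document it parses is a hypothetical, heavily reduced stand-in: file names, ids, the category name, and all geometry are made up, and only the top-level keys images, annotations, and categories follow the COCO format.

```python
import json
from collections import Counter

# Hypothetical, heavily reduced COCO-style document (all values are made up).
coco_text = json.dumps({
    "images": [
        {"id": 1, "file_name": "images/example_a.npy", "width": 3370, "height": 2680},
        {"id": 2, "file_name": "images/example_b.npy", "width": 3370, "height": 2680},
    ],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1, "iscrowd": 0, "area": 2000,
         "bbox": [100, 200, 50, 40],
         "segmentation": [[100, 200, 150, 200, 150, 240, 100, 240]]},
        {"id": 11, "image_id": 1, "category_id": 1, "iscrowd": 0, "area": 600,
         "bbox": [400, 500, 30, 20],
         "segmentation": [[400, 500, 430, 500, 430, 520, 400, 520]]},
        {"id": 12, "image_id": 2, "category_id": 1, "iscrowd": 0, "area": 100,
         "bbox": [10, 10, 10, 10],
         "segmentation": [[10, 10, 20, 10, 20, 20, 10, 20]]},
    ],
    "categories": [{"id": 1, "name": "thermal_bridge"}],
})


def annotations_per_image(coco):
    """Return {file_name: number of annotations} for a parsed COCO document."""
    counts = Counter(ann["image_id"] for ann in coco["annotations"])
    return {img["file_name"]: counts.get(img["id"], 0) for img in coco["images"]}


coco = json.loads(coco_text)
per_image = annotations_per_image(coco)
total_annotations = sum(per_image.values())
```

Applied to the real train and test annotation files, the totals should match the image and annotation counts stated above.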
The STAC Collection JSON object Flug1_collection_stac_spec provides information about the recorded images and the environmental conditions during recordings. It also contains information about the overall bounding box of the entire area in which images were recorded. It links to related STAC Item JSON objects containing information about the recorded city blocks and the cameras. The objects for the six flight paths, i.e. Flug1_100_stac_spec, Flug1_101_stac_spec, Flug1_102_stac_spec, Flug1_103_stac_spec, Flug1_104_stac_spec, Flug1_105_stac_spec, contain the GeoJSON 32 geometry of the respective block and the corresponding bounding box.
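The relation between a block's GeoJSON geometry, its bounding box, and the overall bounding box of the collection can be sketched as below. This is an illustration only: the coordinates are made-up placeholders rather than the surveyed blocks, the helper names are hypothetical, and the remaining fields of a STAC Item or Collection are omitted.

```python
def bbox_from_polygon(geometry):
    """Return [west, south, east, north] for a GeoJSON Polygon geometry."""
    if geometry.get("type") != "Polygon":
        raise ValueError("expected a GeoJSON Polygon")
    exterior = geometry["coordinates"][0]  # the first ring is the exterior ring
    lons = [point[0] for point in exterior]
    lats = [point[1] for point in exterior]
    return [min(lons), min(lats), max(lons), max(lats)]


def merge_bboxes(bboxes):
    """Smallest bounding box that contains all given bounding boxes."""
    wests, souths, easts, norths = zip(*bboxes)
    return [min(wests), min(souths), max(easts), max(norths)]


# Placeholder geometries in (longitude, latitude) order; not the real blocks.
block_a = {"type": "Polygon", "coordinates": [[
    [8.400, 49.010], [8.402, 49.010], [8.402, 49.012], [8.400, 49.012], [8.400, 49.010],
]]}
block_b = {"type": "Polygon", "coordinates": [[
    [8.403, 49.009], [8.405, 49.009], [8.405, 49.011], [8.403, 49.011], [8.403, 49.009],
]]}

item_bboxes = [bbox_from_polygon(block_a), bbox_from_polygon(block_b)]
collection_bbox = merge_bboxes(item_bboxes)
```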
The objects containing the camera information, named Flug1_camera1_stac-spec for the RGB camera and Flug1_camera2_stac-spec for the thermal camera, are based on an existing STAC extension for camera-related metadata. All STAC Item objects have a link to the Flug1_collection_stac_spec Collection object.
Metadata of the archived NumPy files for each image was structured using the Data Package schema from the Frictionless Standards (https://specs.frictionlessdata.io). This standard describes a collection of data files. Therefore, metadata about all containerized NumPy files of the six flight paths is provided within a JSON-based file, named Flug1_100-105_frictionless_standards.
All files are represented in a standardized way as FAIR Digital Objects (FAIR DOs) to enable machine-actionable decisions on the data in the spirit of the FAIR principles 33. This representation further facilitates reproducibility of experiments performed using TBBR and the detection of data errors 34. Thus, each file deposited in Zenodo (https://doi.org/10.5281/zenodo.7022736) 27 was assigned a Persistent Identifier (PID), which is resolvable with the Handle.Net Registry (HNR) (https://www.handle.net/). The full list of PIDs is provided in the TBBR Zenodo dataset description 27.

Technical Validation
The visual identification process and description of thermal bridges on building rooftops was based on typical patterns described in German DIN standards [35][36][37] and thermal infrared inspections 38 . We note, however, that the interpretation of thermal images for building audits is currently always performed by human operators, which involves a high level of subjectivity 13 .
Thermal bridges occur on different parts of rooftops. Table 3 provides an overview of the different roof types and rooftop components where thermal bridges were annotated.
All preselected images were first manually annotated by a single industrial engineer. Then, following the two-person principle, all annotations were subsequently reviewed independently by an expert supervisor and corrected when necessary.
We qualitatively compare the distributions of thermal and height map values of thermal bridges and background between the train and test subsets. Figure 4 shows the histograms of both distributions within their 8-bit channel ranges of [0,255]. As expected, we observe a uniform distribution of thermal values across background pixels, while there is a distinct peak in warmer pixels for thermal bridges. Similarly, the fact that thermal bridges are annotated only on rooftops is reflected in their large height map values, while background pixels are distributed uniformly both at the building level and, to a lesser extent, at street level.
To quantitatively compare annotated distributions, we use scale invariant feature transform (SIFT) descriptors 39, which have been shown to have good general robustness across a range of image transformations 40, e.g. affine transformations, scale changes, and rotations, making them an appropriate basis for comparing thermal bridge images of rooftops from various distances and angles. Figure 5 shows the average Euclidean distances between all 128 SIFT descriptors for annotated thermal bridges and background pixels across the train and test subsets. We observe a small distance between like classes across both train and test subsets, and larger relative distances for unlike classes, indicating that annotated regions contain distinct features from background in a consistent manner.
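The kind of comparison described above can be sketched as the mean pairwise Euclidean distance between two sets of 128-dimensional descriptors. The sketch makes several assumptions: the exact aggregation used for Fig. 5 is not specified in the text, the function name mean_pairwise_distance is hypothetical, and the descriptors here are synthetic random vectors standing in for real SIFT output.

```python
import numpy as np


def mean_pairwise_distance(a, b):
    """Average Euclidean distance over all pairs (one row of a, one row of b)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    diff = a[:, None, :] - b[None, :, :]  # shape (len(a), len(b), dim)
    return float(np.linalg.norm(diff, axis=-1).mean())


# Synthetic stand-ins for 128-dimensional descriptors (not real SIFT output).
rng = np.random.default_rng(0)
bridge_train = rng.normal(0.0, 1.0, size=(50, 128))
bridge_test = rng.normal(0.0, 1.0, size=(40, 128))
background_test = rng.normal(3.0, 1.0, size=(60, 128))

like_distance = mean_pairwise_distance(bridge_train, bridge_test)
unlike_distance = mean_pairwise_distance(bridge_train, background_test)
```

In this synthetic setting the like-class distance is smaller than the unlike-class distance by construction, which mirrors the qualitative pattern reported for the real annotations.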

Usage Notes
The annotation files contain relative paths to the NumPy files. We recommend the folder structure shown in Fig. 6 for usage of TBBR in conjunction with computer vision libraries such as Detectron2 41 or MMDetection 42 , or with the provided TBBRDet library (see Code Availability).
For image analysis pipelines we recommend standardizing the images, i.e. centering them to zero mean with a standard deviation of 1, to make the different channel ranges of the image data comparable: $x' = (x - \mu) / \sigma$, where $\mu$ and $\sigma$ denote the mean and standard deviation of the respective channel.
[...] utilize the PyTorch (v.1.10.2) machine learning framework 44.
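The standardization recommended in the Usage Notes can be sketched per channel as follows. This is a minimal NumPy illustration, not the TBBRDet implementation: it assumes a channel-last array layout of shape (height, width, 5), the function names channel_stats and standardize are hypothetical, and the image is random data.

```python
import numpy as np


def channel_stats(images):
    """Per-channel mean and standard deviation over an iterable of (H, W, C) arrays."""
    total, total_sq, count = None, None, 0
    for image in images:
        x = np.asarray(image, dtype=np.float64)
        flat = x.reshape(-1, x.shape[-1])  # one row per pixel, one column per channel
        s, sq = flat.sum(axis=0), (flat ** 2).sum(axis=0)
        total = s if total is None else total + s
        total_sq = sq if total_sq is None else total_sq + sq
        count += flat.shape[0]
    if count == 0:
        raise ValueError("no images given")
    mean = total / count
    std = np.sqrt(np.maximum(total_sq / count - mean ** 2, 0.0))
    return mean, std


def standardize(image, mean, std, eps=1e-12):
    """Shift each channel to zero mean and scale it to unit standard deviation."""
    return (np.asarray(image, dtype=np.float64) - mean) / np.maximum(std, eps)


# Random stand-in for one five-channel image in the order [B, G, R, Thermal, Height].
rng = np.random.default_rng(0)
example = rng.integers(0, 256, size=(8, 6, 5)).astype(np.uint8)
mean, std = channel_stats([example])
standardized = standardize(example, mean, std)
```

In practice the coefficients would typically be computed once over the training subset and then reused unchanged for the test subset, so that both are scaled identically.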
Conceptually, the software provides the following functionalities: VGG annotation to COCO JSON converter implementing fully automatic conversion from the annotation format generated during the manual labeling process into the COCO JSON format archived on Zenodo.
Dataset mappers for the Detectron2 and MMDetection libraries implementing random-access collections to individual images and corresponding annotations. These are necessary for enabling the loading of five-channel images in each library. Data may be augmented by arbitrary transformations during the loading procedure.
Model configuration for all Detectron2 and MMDetection experiments performed in related works.
Training/evaluation scripts for performing training and evaluation of neural networks for both Detectron2 and MMDetection.
Dataset/experiment utilities for exploring the dataset, calculating image normalization coefficients, combining model scores, and calculating SLURM workload manager system 45 statistics (consumed energy, runtime, etc.).
For creating, updating, and validating the FAIR DOs, the Typed PID Maker was used. This is a component of the FAIR DO Lab for working on FAIR DO tasks, which is found at https://github.com/kit-data-manager/FAIR-DO-Lab.